Detecting non-causal artifacts in multivariate linear regression models
Abstract
We consider linear models where d potential causes X_1, ..., X_d are correlated with one target quantity Y and propose a method to infer whether the association is causal or whether it is an artifact caused by overfitting or hidden common causes. We employ the idea that in the former case the vector of regression coefficients has ‘generic’ orientation relative to the covariance matrix Σ_XX of X. Using an ICA-based model for confounding, we show that both confounding and overfitting yield regression vectors that concentrate mainly in the space of low eigenvalues of Σ_XX.
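To make the last claim concrete, here is a minimal sketch of how one might check where a fitted regression vector sits relative to the eigenvalues of the sample covariance matrix. It is not the authors' test: the function names `spectral_weights` and `eigenvalue_ratio`, the ratio diagnostic itself, and the simulated data are all assumptions made for illustration. Overfitting is mimicked here by regressing a target that is pure noise, unrelated to X.

```python
import numpy as np


def spectral_weights(X, y):
    """Eigenvalues of the sample covariance of X (ascending) and the share of
    the squared OLS coefficient vector that lies along each eigenvector."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    cov = Xc.T @ Xc / len(Xc)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    a_hat, *_ = np.linalg.lstsq(Xc, yc, rcond=None)   # OLS coefficients
    proj = eigvecs.T @ a_hat                          # coordinates in the eigenbasis
    weights = proj**2 / np.sum(proj**2)
    return eigvals, weights


def eigenvalue_ratio(X, y):
    """Weight-averaged eigenvalue divided by the plain average eigenvalue.
    Values far below 1 mean the coefficients sit mostly on low eigenvalues."""
    eigvals, weights = spectral_weights(X, y)
    return float(weights @ eigvals / eigvals.mean())


if __name__ == "__main__":
    # Hypothetical simulation settings, chosen only for illustration.
    rng = np.random.default_rng(0)
    n, d = 200, 20
    scales = np.logspace(-2, 1, d)                    # population eigenvalues
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthonormal basis
    X = (rng.standard_normal((n, d)) * np.sqrt(scales)) @ Q.T

    a = rng.standard_normal(d)
    a /= np.linalg.norm(a)                            # randomly oriented coefficients
    y_causal = X @ a + 0.1 * rng.standard_normal(n)   # Y genuinely depends on X
    y_noise = rng.standard_normal(n)                  # Y unrelated to X

    print("causal ratio:", eigenvalue_ratio(X, y_causal))
    print("noise ratio: ", eigenvalue_ratio(X, y_noise))
```

Under these assumptions, a randomly oriented coefficient vector gives a ratio that scatters around 1, while coefficients fitted to pure noise are pulled towards the low-eigenvalue directions and give a ratio well below 1. A confounded model would need its own simulation, which this sketch does not attempt.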
Similar resources
Application of non-linear regression and soft computing techniques for modeling process of pollutant adsorption from industrial wastewaters
The process of pollutant adsorption from industrial wastewaters is a multivariate problem. This process is affected by many factors including the contact time (T), pH, adsorbent weight (m), and solution concentration (ppm). The main target of this work is to model and evaluate the process of pollutant adsorption from industrial wastewaters using the non-linear multivariate regression and intell...
Prediction of Blasting Cost in Limestone Mines Using Gene Expression Programming Model and Artificial Neural Networks
Predicting blasting cost (BC) to achieve optimal fragmentation is necessary in order to control the adverse consequences of blasting such as fly rock, ground vibration, and air blast in open-pit mines. In this research work, BC is predicted by collecting 146 blasting data from six limestone mines in Iran and applying artificial neural networks (ANNs), gene expression programming (G...
Modeling of temperature in friction stir welding of duplex stainless steel using multivariate Lagrangian methods, linear extrapolation and multiple linear regression
In this study, the temperature in friction stir welding of duplex stainless steel has been investigated. First, the temperature was modeled and estimated at different distances from the center of the stir zone using the multivariate Lagrangian function. Then, the linear extrapolation method and the multiple linear regression method were used to estimate the temperature outside the range and ...
The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models
In previous studies on fitting non-linear regression models with a symmetric structure, normality is usually assumed in the analysis of the data. This choice may be inappropriate when the distribution of the residual terms is asymmetric. Recently, the family of scale-mixture of skew-normal distributions has become the main concern of many researchers. This family includes several skewed and heavy-tailed d...